Hybrid acoustic models for distant and multichannel large vocabulary speech recognition

Abstract

We investigate the application of deep neural network (DNN)-hidden Markov model (HMM) hybrid acoustic models for far-field speech recognition of meetings recorded using microphone arrays. We show that the hybrid models achieve significantly better accuracy than conventional systems based on Gaussian mixture models (GMMs). We observe up to 8% absolute word error rate (WER) reduction from a discriminatively trained GMM baseline when using a single distant microphone, and between 4–6% absolute WER reduction when using beamforming on various combinations of array channels. By training the networks on audio from multiple channels, we find that the networks can recover a significant part of the accuracy difference between the single distant microphone and beamformed configurations. Finally, we show that the accuracy of a network recognising speech from a single distant microphone can approach that of a multi-microphone setup by training with data from other microphones.
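The abstract reports its gains as absolute WER reductions, which are easy to confuse with relative ones. The following is a minimal sketch of how WER is commonly computed (word-level edit distance divided by the number of reference words) and of how absolute and relative reductions differ. The function names and the example figures (50% and 42%) are hypothetical illustrations, not values or code from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def absolute_reduction(baseline_wer: float, new_wer: float) -> float:
    """Difference in WER, in the same units as the inputs."""
    return baseline_wer - new_wer


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Difference in WER as a fraction of the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical figures for illustration only, not results from the paper.
    baseline, hybrid = 0.50, 0.42
    print(f"absolute: {absolute_reduction(baseline, hybrid):.2f}")
    print(f"relative: {relative_reduction(baseline, hybrid):.2f}")
```

With these hypothetical figures, the same improvement is an 8-point absolute reduction but a 16% relative reduction, which is why the abstract states explicitly that its figures are absolute.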